Abstract: It is well known that ideal free-space optical communication 
at the quantum limit can have unbounded photon information efficiency 
(PIE), measured in bits per photon. High PIE 
comes at a price of low dimensional information efficiency (DIE), 
measured in bits per spatio-temporal-polarization mode. If only 
temporal modes are used, then DIE translates directly to band- 
width efficiency. In this paper, the DIE vs. PIE tradeoffs for 
known modulations and receiver structures are compared to 
the ultimate quantum limit, and analytic approximations are 
found in the limit of high PIE. This analysis shows that known 
structures fall short of the maximum attainable DIE by a factor 
that increases linearly with PIE for high PIE. 

The capacity of the Dolinar receiver is derived for binary 
coherent-state modulations and computed for the case of on- 
off keying (OOK). The DIE vs. PIE tradeoff for this case is 
improved only slightly compared to OOK with photon counting. 
An adaptive rule is derived for an additive local oscillator that 
maximizes the mutual information between a receiver and a 
transmitter that selects from a set of coherent states. For binary 
phase-shift keying (BPSK), this is shown to be equivalent to the 
operation of the Dolinar receiver. 

The Dolinar receiver is extended to make adaptive mea- 
surements on a coded sequence of coherent state symbols. 
Information from previous measurements is used to adjust the 
a priori probabilities of the next symbols. The adaptive Dolinar 
receiver does not improve the DIE vs. PIE tradeoff compared to 
independent transmission and Dolinar reception of each symbol. 

I. Introduction 

An optical communication system can be represented by the 
block diagram shown in Figure 1. At the transmitter, structured 
redundancy is introduced into message bits (encoding), such 
that the receiver can correct errors introduced during com- 
munication. The encoded bits are then mapped into states of 
optical fields (modulation). These optical states are typically 
coherent states, which are produced by lasers; more generally, 
they could be thermal states or other more exotic quantum 
mechanical states. The optical fields convey the information 
through the communication medium (the channel), and map to 
(a possibly different set of) states at the receiver. For example, 
in ideal free-space communication, a coherent state is simply 
attenuated via propagation but otherwise unaltered, whereas in 
a turbulent channel it maps to a mixed state that is no longer 
a coherent state. The receiver performs a measurement on 
the received optical fields (demodulation or detection) and the 
measurement outcomes are then used to generate an estimate 
of the message that was transmitted (decoding). 
Fig. 1. Block diagram of an optical communication system. The transmitter 
performs encoding and modulation to map information bits into optical field 
states. The propagation maps these transmitted states into received states. The 
receiver performs a measurement on the received states and the outcomes are 
decoded to estimate the transmitted message. 



In the classical formulation of information theory, the maximum 
rate of reliable communication is determined by maximizing 
Shannon's well-known mutual information metric [1], [2]. 
In performing this maximization, the channel is represented 
as a probability map, specifying the probability of a particular 
measurement outcome given the transmitted value. Thus, the 
mapping from bits to optical fields at the transmitter and the 
reverse mapping via the measurement at the receiver are part 
of the channel. Because light fields are fundamentally quantum 
mechanical, the probabilistic mapping defining the channel is 
determined by the choice of measurement. For example, an 
ideal coherent state measured with a photon-counting detector 
yields Poisson statistics with mean proportional to the incident 
power, whereas the same state measured with a homodyne 
receiver yields Gaussian statistics with mean equal to the field 
amplitude and variance 1/4 [3], [7]. Hence, to find the highest 
rate of communication for a given modulation scheme, one 
must perform the daunting task of maximizing the Shannon in- 
formation over all possible measurements at the receiver. In the 
quantum-mechanical formulation of information theory, the 
highest rate of reliable communication is given by the Holevo 
information metric Q-Q, which implicitly maximizes the 
Shannon mutual information over all measurement schemes. 

The maximum rate of transmitting information reliably over 
a communication link is its capacity, C, which is commonly 
measured as a channel throughput in units of bits per channel 
use. It is also common to characterize the link's capability 
in terms of its efficiency in utilizing link resources [8]. The 
photon information efficiency (PIE), denoted by $C_p$, measures 
the efficiency of information transfer per unit energy; it is 
given by: 

$$C_p = \frac{C}{E} \quad \text{(bits per photon)}, \tag{1}$$

where $E$ is an average photon number constraint imposed 
on the signal at the transmitter or receiver. The dimensional 
information efficiency (DIE), denoted by $C_d$, measures the 
efficiency of information transfer per unit dimension of the 
optical signal; it is given by: 

$$C_d = \frac{C}{D} \quad \text{(bits per dimension)}, \tag{2}$$

where D is the number of dimensions utilized per use of 
the channel. These dimensions can be any degree of freedom 
afforded to an optical communication link, namely temporal, 
spatial or polarization. In these expressions, the numbers of 
photons E and dimensions D are normalized per use of the 
channel, as is the capacity C. 

As an example, consider a free-space (i.e., vacuum) optical 
communication link over a distance $L$, utilizing transmitter 
and receiver apertures having unobscured areas $A_t$ and $A_r$, 
respectively (Figure 2). Let $\lambda$ denote the center wavelength 
of the optical fields, $T$ the duration of each optical pulse 
transmitted over the channel, and $B$ the total bandwidth 
occupied by the end-to-end system. The maximum number 
of independent temporal dimensions that could be utilized 
by this link for every use of the channel is approximately 
$BT$, and the maximum number of independent spatial 
dimensions is approximately $A_t A_r/(\lambda L)^2$ [8], [9]. If two 
independent polarizations are also utilized, then the maximum 
number of independent dimensions is approximately 
$D = 2BT A_t A_r/(\lambda L)^2$, which can be thought of as the number of 
vector-valued spatio-temporal basis functions that are required 
to represent an arbitrary function in the (temporal, spatial, 
polarization) subspace determined by the communication link 
parameters. 
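
The mode count above is easy to evaluate numerically. The following is a minimal sketch of that arithmetic; the function name and the link parameters in the usage line are hypothetical values chosen only for illustration, not taken from the paper.

```python
def max_dimensions(bandwidth_hz, pulse_s, area_tx_m2, area_rx_m2,
                   wavelength_m, range_m, polarizations=2):
    """Approximate number of independent modes per channel use.

    Implements D = polarizations * (B*T) * (At*Ar / (lambda*L)^2),
    the product of the temporal, spatial and polarization mode counts.
    """
    temporal = bandwidth_hz * pulse_s
    spatial = area_tx_m2 * area_rx_m2 / (wavelength_m * range_m) ** 2
    return polarizations * temporal * spatial


# Hypothetical link: 1 GHz bandwidth, 1 ns pulses, 1 m^2 apertures,
# 1 micron wavelength, 1000 km range.
example_d = max_dimensions(1e9, 1e-9, 1.0, 1.0, 1e-6, 1e6)
```

With these illustrative numbers both the time-bandwidth product and the Fresnel number product are about 1, so only the two polarizations contribute.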
Fig. 2. The paraxial optical communication geometry, with time-bandwidth 
product $BT$ and Fresnel number product $A_t A_r/(\lambda L)^2$, where $\lambda$ is the 
center wavelength. 



II. Fundamental Capacity Efficiency Tradeoffs 

Practical laser systems typically generate coherent states 
using a convenient modulation such as pulse-position modulation 
(PPM), on-off keying (OOK), binary phase-shift keying 
(BPSK), or higher-order phase and/or amplitude modulations 
such as quadrature phase-shift keying (QPSK) and quadrature 
amplitude modulation (QAM). It is well known that photon 
efficiency is unbounded in the quantum noise limit, and that 
this is achievable with a conventional system using PPM 
and photon counting. But high photon efficiency comes at a 
price of exponentially decreasing dimensional efficiency. The 
ultimate Holevo limit is better than the Shannon limits based 
on PPM and other known strategies, but it still imposes harsh 
limitations on the dimensional efficiency that can be achieved 
at high photon efficiency. 

Fig. 3. Theoretical limits on the best possible tradeoff between photon 
information efficiency (bits/photon) and dimensional information efficiency 
(bits/dimension or bits/mode). 

Our aim in this paper is to understand the gap between 
the Holevo limit and the Shannon limits of known systems, 
and to determine the characteristics of new systems that might 
bridge this gap. Figure 3 shows the fundamental DIE vs. PIE 
capacity tradeoffs for free-space optical communication under 
an average power constraint, under various assumptions about 
the optical modulation and detection. All but four of the curves 
presented in Fig. 3 come from formulas stated in [10], [11] 
and originally derived in references cited therein. The curve 
labeled "PPM + photon counting" in Fig. 3 is based on an 
analytic approximation derived in Section III of this paper, 
and it replaces a similar tradeoff curve computed numerically 
in [10], [11]. The curve labeled "OOK + Dolinar receiver" 
is new, and it comes from a capacity derivation for a general 
Dolinar receiver structure in Section IV. Fig. 3 also shows two 
simple but accurate approximations (denoted as "asymptotic 
approx #1" and "asymptotic approx #2") that are derived in 
Section III based on asymptotic analysis for high PIE of two 
of the tradeoff curves in Fig. 3. 

A. The Ultimate Quantum Limit 

The ultimate quantum limit (outermost gold curve) is an 
immutable upper bound on the best possible tradeoff between 
photon efficiency and dimensional efficiency. This limit can 
only be achieved with quantum-optimal modulation, detection, 
coding and decoding. For single-mode communication the 
dimensional efficiency (in bits/mode) on the vertical axis is 
the same as the spectral efficiency (in bits/sec/Hz). The upper 
limit on the spectral efficiency of a multi-mode system scales 
in proportion to the number of independent non-temporal 
modes. The ultimate quantum limit is achievable with optimal 
coherent-state modulation [12]. 

B. Limits with Constrained Modulation and/or Detection 

Fig. 3 also shows the (inferior) tradeoff limits obtained 
by restricting the modulation and/or detection to sub-optimal 
forms. Heterodyne or homodyne receivers can be used with 
arbitrary coherent-state modulation, and such receivers teamed 
with high-order modulations can achieve high dimensional 
efficiency (upper left corner of graph). However, these 
receivers encounter brick-wall limits on their achievable photon 
efficiencies of $\log_2 e \approx 1.44$ bits/photon (1 nat/photon) and 
$2 \log_2 e \approx 2.89$ bits/photon (2 nats/photon), respectively. 
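
The two brick-wall values can be checked numerically. The sketch below assumes the standard single-mode capacity formulas for coherent-state signaling, $\log_2(1+E)$ bits for heterodyne and $\frac{1}{2}\log_2(1+4E)$ bits for homodyne detection, with $E$ the mean photon number per mode. These formulas are not stated in this text and are brought in here as an assumption.

```python
import math


def pie_heterodyne(energy):
    """Bits per photon for heterodyne detection, assuming C = log2(1 + E)."""
    return math.log1p(energy) / math.log(2) / energy


def pie_homodyne(energy):
    """Bits per photon for homodyne detection, assuming C = 0.5*log2(1 + 4E)."""
    return 0.5 * math.log1p(4.0 * energy) / math.log(2) / energy


# Photon efficiency rises toward its ceiling as the energy per mode shrinks.
for energy in (1.0, 1e-2, 1e-4, 1e-6):
    print(energy, pie_heterodyne(energy), pie_homodyne(energy))
```

Under these assumed formulas the photon efficiency increases monotonically as $E \to 0$ but never exceeds $\log_2 e$ or $2\log_2 e$, respectively.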

Applications that require high photon efficiency typically 
use PPM or OOK modulation combined with a photon count- 
ing receiver. There is no upper limit on photon efficiency 
achievable with PPM or OOK modulation and photon count- 
ing. However, the maximum dimensional efficiency achievable 
by such systems operating at around 10 bits/photon is more 
than an order of magnitude lower than that achievable at the 
ultimate quantum limit. The tradeoff limits for PPM are strictly 
inferior to those for OOK even though PPM is regarded as 
a high-order modulation while OOK is binary. The reason 
is that the dimensional efficiency of PPM is determined by 
the number of slots, and within each slot PPM is simply a 
constrained form of OOK. At high photon efficiencies such as 
10 bits/photon, this performance difference between PPM and 
OOK is very small. 

Binary modulations such as OOK and BPSK cannot achieve 
dimensional efficiency greater than 1 bit/dimension, and the 
corresponding limit for PPM is only 1/2 bit/dimension. How- 
ever, the focus of this paper is in the high photon efficiency 
region. At 10 bits/photon, BPSK and OOK modulations are 
only slightly sub-optimal with respect to the ultimate quantum 
limit if they could be teamed with quantum-optimal ultimate 
receivers. 

The Dolinar receiver [13], [14] is a quantum-optimal 
receiver for distinguishing arbitrary binary coherent states with 
the objective of minimizing the uncoded error probability. 
When teamed with OOK modulation, its tradeoff curve of 
dimensional and photon efficiencies is uniformly but 
insignificantly better than the corresponding tradeoff achieved by 
photon counting. When teamed with BPSK modulation, the 
Dolinar receiver runs into the same brick-wall limit on photon 
efficiency (2.89 bits/photon) encountered by homodyning. 

III. Analysis of the Capacity Efficiency Tradeoffs 

In this section, we derive the closed-form approximation 
presented in Fig. 3 for the DIE vs. PIE tradeoff curve attained 
by PPM modulation and photon counting, and we examine 
its asymptotic behavior for high PIE compared to that of 
the ultimate Holevo limit. The resulting asymptotic formulas 
give rise to the two simple analytic approximations that were 
plotted in Fig. 3 alongside the true capacity tradeoff curves. 

A. Capacity of PPM and photon counting 

For a realizable system using $M$-ary PPM and photon 
counting, the capacity $C$ per PPM symbol is very simple: 

$$C = \left(1 - e^{-E}\right) \log_2 M, \tag{3}$$

where $E$ is the average energy (number of photons) used in 
the one slot that contains the PPM pulse. In this case, the 
photon efficiency 

$$C_p = \frac{C}{E} = \frac{\left(1 - e^{-E}\right) \log_2 M}{E} \tag{4}$$

increases without bound for large $M$ and fixed $E$. But the 
dimensional efficiency is normalized by the number of 
dimensions (slots) $M$, 

$$C_d = \frac{C}{M} = \frac{\left(1 - e^{-E}\right) \log_2 M}{M}, \tag{5}$$

and this goes to zero for large M. The tradeoff between these 
two capacity efficiencies is typically studied by evaluating 
these expressions parametrically as a function of E for various 
fixed values of M. A different Cd versus Cp tradeoff curve is 
obtained for each M, and the optimum tradeoff is the outer 
envelope of all such curves. 
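
The parametric procedure just described can be sketched in a few lines. The code below evaluates (3)-(5) for fixed $M$, solves for the pulse energy at which a given order reaches a target photon efficiency, and takes the best dimensional efficiency over a set of orders. Restricting the search to power-of-two orders and the bisection bounds are illustrative assumptions, not choices made in the paper.

```python
import math


def ppm_efficiencies(order, energy):
    """Return (Cp, Cd) for M-ary PPM with photon counting, per eqs. (3)-(5)."""
    capacity = -math.expm1(-energy) * math.log2(order)  # (1 - e^-E) log2 M
    return capacity / energy, capacity / order


def cd_at_target_cp(order, cp_target):
    """Cd of a fixed PPM order at the pulse energy where Cp equals cp_target.

    Returns None when the order cannot reach cp_target, since Cp < log2(M)
    for every E > 0. Assumes cp_target is well above log2(M)/1000.
    """
    if math.log2(order) <= cp_target:
        return None
    lo, hi = 1e-12, 1e3  # Cp decreases monotonically in E on this bracket
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisection on a logarithmic scale
        if ppm_efficiencies(order, mid)[0] > cp_target:
            lo = mid
        else:
            hi = mid
    return ppm_efficiencies(order, lo)[1]


def envelope_point(cp_target, max_log2_order=30):
    """Best (Cd, M) over the orders M = 2, 4, ..., 2**max_log2_order."""
    points = []
    for k in range(1, max_log2_order + 1):
        cd = cd_at_target_cp(2 ** k, cp_target)
        if cd is not None:
            points.append((cd, 2 ** k))
    return max(points)


best_cd, best_order = envelope_point(10.0)
```

Sweeping `cp_target` traces out one point of the outer envelope per call.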

Better insight into the behavior of the optimum tradeoff is 
obtained by solving for the optimum $M = M^*$ as if real 
values of $M^*$ were allowed. This method yields an upper 
bound that is also an extremely good approximation. To do 
this, we use (4) and (5) to compute 

$$\log_2 C_d + C_p = \log_2(E C_p) - \log_2 M + C_p = \log_2(E C_p) + C_p \left(1 - \frac{E}{1 - e^{-E}}\right). \tag{6}$$

Maximizing $C_d$ for a given $C_p$ is equivalent to maximizing 
$\log_2 C_d + C_p$ for given $C_p$, but the latter expression is more 
convenient for optimization, because only $E$ remains as a free 
parameter, with $M$ having been eliminated. Differentiating the 
right side of (6) with respect to $E$ and setting the derivative to 
zero at $E = E^*$ yields an equation that can be solved explicitly 
for $C_p$ in terms of the optimizing value of $E^*$: 

$$C_p = \frac{\left(1 - e^{-E^*}\right)^2}{(E^* \ln 2)\left[1 - (1 + E^*) e^{-E^*}\right]}. \tag{7}$$

The corresponding optimum $M = M^*$ (allowed to be 
real-valued) is given by: 

$$\log_2 M^* = \frac{E^* C_p}{1 - e^{-E^*}} = \frac{1 - e^{-E^*}}{1 - (1 + E^*) e^{-E^*}} \cdot \frac{1}{\ln 2}, \tag{8}$$

and the resulting dimensional efficiency is: 

$$C_d = \frac{\left(1 - e^{-E^*}\right)^2 \exp\left(-\frac{1 - e^{-E^*}}{1 - (1 + E^*) e^{-E^*}}\right)}{(\ln 2)\left[1 - (1 + E^*) e^{-E^*}\right]}. \tag{9}$$

This expression (9) for the dimensional efficiency $C_d$ and the 
previous expression (7) for the photon efficiency $C_p$ together 
specify the optimum $C_d$ versus $C_p$ tradeoff curve parametrically 
in terms of $E^*$. 

B. Asymptotic capacity of PPM and photon counting 

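In the parametric expressions (7) and (9) for $C_p$ and $C_d$, large 
values of PIE correspond to small values of $E^*$. As $E^* \to 0$, 
we have 

$$E^* C_p \to \frac{2}{\ln 2} \tag{10}$$

and 

$$\log_2 M^* - C_p \to \frac{1}{\ln 2}. \tag{11}$$

From these two expressions we see that the optimum PPM 
order $M^*$ increases exponentially with increasing $C_p$. 

Finally we see from (6) that 

$$\log_2 C_d + C_p \to \log_2\left(\frac{2}{\ln 2}\right) - \frac{1}{\ln 2} \tag{12}$$

$$= \log_2\left(\frac{2}{e \ln 2}\right) \approx 0.086. \tag{13}$$

Therefore, our analytic approximation to the optimum $C_d$ 
versus $C_p$ tradeoff curve behaves asymptotically as: 

$$C_d \approx C_d^{(1)} = \frac{2}{e \ln 2}\, 2^{-C_p} \approx 1.061 \times 2^{-C_p}. \tag{14}$$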



Thus, with PPM and photon counting, the dimensional infor- 
mation efficiency must fall off exponentially with increasing 
photon information efficiency. 
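
A quick numerical comparison illustrates how closely the asymptote tracks the parametric curve. This sketch evaluates the real-valued-$M$ optimum using the closed forms $C_p = (1-e^{-E^*})^2 / \{(E^* \ln 2)[1-(1+E^*)e^{-E^*}]\}$ and $C_d = (1-e^{-E^*})^2 \exp(-(1-e^{-E^*})/[1-(1+E^*)e^{-E^*}]) / \{(\ln 2)[1-(1+E^*)e^{-E^*}]\}$, and compares against $(2/(e \ln 2))\,2^{-C_p}$. The function names are illustrative.

```python
import math


def ppm_optimum(energy):
    """(Cp, Cd) on the real-valued-M optimum PPM curve at pulse energy E*."""
    a = -math.expm1(-energy)                         # 1 - e^{-E*}
    b = 1.0 - (1.0 + energy) * math.exp(-energy)     # 1 - (1 + E*) e^{-E*}
    cp = a * a / (energy * math.log(2) * b)
    cd = a * a * math.exp(-a / b) / (math.log(2) * b)
    return cp, cd


def ppm_asymptote(cp):
    """High-PIE approximation (2 / (e ln 2)) * 2^(-Cp)."""
    return 2.0 / (math.e * math.log(2)) * 2.0 ** (-cp)


for energy in (1.0, 0.5, 0.25, 0.1):
    cp, cd = ppm_optimum(energy)
    print(f"E*={energy:5.2f}  Cp={cp:7.3f}  Cd={cd:.3e}  "
          f"asymptote/Cd={ppm_asymptote(cp) / cd:.4f}")
```

The printed ratio stays above 1 and approaches 1 as $E^*$ shrinks, consistent with the asymptote being a slight overestimate at high photon efficiency.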

C. Asymptotic ultimate Holevo capacity 

At the ultimate Holevo limit, the dimensional and photon 
efficiencies are given by: 

$$C_d = (E + 1) \log_2(E + 1) - E \log_2 E, \qquad C_p = C_d / E, \tag{15}$$

where $E$ is the average number of photons per dimension. For 
this case we compute: 

$$\log_2(C_d / C_p) + C_p = \frac{1}{E} (E + 1) \log_2(E + 1). \tag{16}$$

As $E \to 0$, 

$$\log_2(C_d / C_p) + C_p \to \log_2 e. \tag{17}$$

So asymptotically for large $C_p$, the DIE vs. PIE tradeoff curve 
at the ultimate Holevo limit is given by: 

$$C_d \approx C_d^{(2)} = e\, C_p\, 2^{-C_p}. \tag{18}$$



Even at the ultimate limit, the dimensional efficiency Cd must 
fall off exponentially with increasing photon efficiency Cp, 
except for a multiplicative factor proportional to Cp. 
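
The same kind of check applies at the Holevo limit. The sketch below evaluates $C_d = (E+1)\log_2(E+1) - E\log_2 E$ and $C_p = C_d/E$ parametrically in $E$ and compares against the asymptotic form $e\,C_p\,2^{-C_p}$. The function names are illustrative.

```python
import math


def holevo_efficiencies(energy):
    """(Cp, Cd) at the ultimate Holevo limit for E photons per dimension."""
    cd = (energy + 1.0) * math.log2(energy + 1.0) - energy * math.log2(energy)
    return cd / energy, cd


def holevo_asymptote(cp):
    """High-PIE approximation e * Cp * 2^(-Cp)."""
    return math.e * cp * 2.0 ** (-cp)


for energy in (0.2, 1e-2, 1e-3):
    cp, cd = holevo_efficiencies(energy)
    print(f"E={energy:g}  Cp={cp:.3f}  Cd={cd:.4e}  "
          f"asymptote/Cd={holevo_asymptote(cp) / cd:.4f}")
```

Here the ratio stays below 1, so the asymptotic form slightly underestimates the exact curve, with the gap closing as $E \to 0$.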



D. Usefulness of the asymptotic approximations 

The asymptotic expressions $C_d^{(1)}$ in (14) and $C_d^{(2)}$ in (18) are 
numerically accurate relative to the dimensional information 
efficiencies from which they were derived, for high enough photon 
information efficiency. The approximation $C_d^{(1)}$ overestimates the 
dimensional efficiency achieved with "PPM + photon counting" by 
less than 10% for photon efficiencies above 4 bits/photon. The 
approximation $C_d^{(2)}$ underestimates the dimensional efficiency 
achieved at the ultimate Holevo bound by less than 10% for 
photon efficiencies above 4 bits/photon. 

The asymptotic expressions $C_d^{(1)}$ and $C_d^{(2)}$ also serve as 
reasonably accurate approximations to the dimensional information 
efficiencies for several of the other cases considered in 
Fig. 3. The approximation $C_d^{(1)}$ underestimates the dimensional 
efficiency achieved with "OOK + photon counting" or "OOK 
+ Dolinar receiver" by less than 10% for photon efficiencies 
above 10 bits/photon. The approximation $C_d^{(2)}$ overestimates 
the dimensional efficiency achievable at the Holevo limit for 
the two cases of constrained modulation. For the "BPSK + 
ultimate receiver" curve, this overestimate is within 10% of 
the true curve for photon efficiencies above 7 bits/photon. For 
the "OOK + ultimate receiver" curve, it is within 10% of the 
true curve for photon efficiencies above 16 bits/photon. 

Comparing the asymptotic approximations for the ultimate 
Holevo capacity and the capacity of PPM with photon counting, 
we obtain: 

$$\frac{C_d^{(2)}\ \text{(ultimate Holevo)}}{C_d^{(1)}\ \text{(PPM + counting)}} = \frac{e^2 \ln 2}{2}\, C_p \approx 2.561\, C_p. \tag{19}$$

This expression gives a good approximation, for high photon 
efficiency, to the best possible factor by which the dimensional 
efficiency can be improved by replacing a conventional system 
employing PPM and photon counting with one that reaches the 
ultimate Holevo limit. This improvement factor is only linear 
in the photon efficiency. 

IV. Capacity Efficiency of the Dolinar Receiver 

In [10] there was a perplexing result that the Dolinar receiver, 
despite being precisely optimal for the binary coherent 
state detection problem, yielded a DIE vs. PIE tradeoff that 
was markedly inferior to the optimal tradeoffs obtained with 
photon counting of either PPM or OOK. In particular, the 
photon information efficiency of the Dolinar receiver appears 
to hit the same brick wall limit of 2.89 bits/photon as that of 
a homodyne receiver. However, that conclusion results from 
imposing a constraint of equal a priori probabilities on the 
operation of the Dolinar receiver. In this section, we show 
that, when that constraint is lifted, the capacity attained by 
the Dolinar receiver with OOK is actually better than that 
achieved with photon counting, but the improvement is tiny. 

A. Capacity for arbitrary binary coherent states 

When a canonical Dolinar receiver is used to optimally 
distinguish between two equally likely coherent states with 
minimum probability of error, the a posteriori probability 
that the receiver's observations favor the true state evolves 
monotonically upward with observation time as additional 
received energy is accumulated. When the same receiver 
structure is used to optimally distinguish between two coherent 
states that are not equally likely a priori, the Dolinar receiver 
for this case operates exactly as the canonical Dolinar receiver 
for equally likely states, except that it starts from the point 
at which the canonical receiver's a posteriori probability has 
evolved to the point where it equals the a priori probability 
of the more likely state. 

When a Dolinar receiver measurement is used to distinguish 
two states with equal a priori probabilities, the associated 
channel is a binary symmetric channel (BSC) with crossover 
probability equal to the receiver's error probability for the 
corresponding binary detection problem. This is the case for 
which the formulas and curves presented in [10] are applicable. 
When a Dolinar receiver measurement is used to distinguish 
two states with unequal a priori probabilities, the associated 
channel may still be considered binary, but it is no longer 
symmetric. In this case, the possible channel outputs divide 
into two categories,^1 as shown in Figure 4, corresponding to 
even and odd numbers of observed photon counts, and the 
conditional probabilities of observing even or odd counts are 
different for the two channel inputs. 

The conditional probabilities of obtaining even counts were 
derived in eqs. (6.34) and (6.35) of [14], but here we will 
express them more conveniently. Let $|\psi_0\rangle$ and $|\psi_1\rangle$ denote 
the two coherent states, and let $s = |\langle\psi_0|\psi_1\rangle|^2$ denote their 
overlap (squared inner product). If $\xi$ denotes the a priori 
probability of the less likely state, then the conditional probabilities 
of getting even and odd numbers of counts can be written as: 
(20) 



where the conditioning events corresponding to the more 
probable and less probable states are denoted by the 
superscripts $+$ and $-$, respectively. The probability of error achieved 
by this receiver is obtained by averaging the two crossover 
probabilities in Fig. 4: 

$$P_e = \xi P^{-}_{\text{even}} + (1 - \xi) P^{+}_{\text{odd}}. \tag{21}$$

This error probability achieves the quantum limit for 
distinguishing between two states, called the Helstrom bound [16]: 

$$P_e = \frac{1}{2}\left[1 - \sqrt{1 - 4\xi(1 - \xi)s}\right]. \tag{22}$$


^1 No additional mutual information is obtained by resolving the possible 
channel outputs into more than these two categories; see Section V-A. 

Fig. 4. Binary asymmetric channel obtained by applying a Dolinar receiver 
measurement to binary coherent states with arbitrary a priori probabilities. 



In the special case when $\xi = 1/2$, these four conditional 
probabilities characterize a BSC with crossover probability 

$$P_e = \frac{1}{2}\left(1 - \sqrt{1 - s}\right). \tag{23}$$

In the more general case, the mutual information $I(X;Y)$ 
conveyed across the asymmetric channel shown in Fig. 4 is: 

$$I(X;Y) = H_2(P_{\text{odd}}) - \left[\xi H_2(P^{-}_{\text{odd}}) + (1 - \xi) H_2(P^{+}_{\text{odd}})\right], \tag{24}$$

where $H_2(x) = -x \log_2 x - (1 - x) \log_2(1 - x)$ is the binary 
entropy function, and $P_{\text{odd}}$ is the unconditional probability of 
observing odd counts: 

$$P_{\text{odd}} = (1 - \xi) P^{+}_{\text{odd}} + \xi P^{-}_{\text{odd}}. \tag{25}$$


The mutual information expression (24) is valid for the channel 
obtained by applying the Dolinar receiver measurement to any 
pair of coherent states. However, this channel is peculiar, 
because its crossover probabilities are dependent on the a priori 
probabilities of its inputs. The maximum mutual information 
per channel use is attained for $\xi = 1/2$, and this yields the 
capacity per channel use of a BSC as stated in [10]: 

$$C = 1 - H_2(P_e) \quad \text{(bits per channel use)}. \tag{26}$$

For BPSK-modulated coherent states, the maximum mutual 
information per photon is also attained for $\xi = 1/2$, because 
in this case the two states have equal energy. 

B. Capacity for OOK modulation 

If the two coherent states are not equally energetic, the 
maximum mutual information per photon is generally attained 
for $\xi < 1/2$, where $\xi$ is the a priori probability of the more 
energetic state. OOK modulation produces the most disparate 
state energies. In this case, the overlap between the "ON" 
and "OFF" states is $s = e^{-E/\xi}$, where $E/\xi$ is the average 
number of photons in the "ON" pulse. The optimum channel 
input probability $\xi$ for maximizing mutual information per 
photon approaches 0 in the limit of large photon information 
efficiency. Closed-form expressions are not available. The DIE 
vs. PIE tradeoff curve for this case in Fig. 3 was obtained by 
maximizing (24) numerically over $\xi$ for given $E$ to obtain 
$C(E)$, then varying $E$ parametrically to trace out $C_d = C(E)$ 
vs. $C_p = C(E)/E$. This curve is uniformly better than the 
corresponding DIE vs. PIE tradeoff curve obtained for OOK 
modulation with photon counting, but only by a minuscule 
amount. 
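
The numerical maximization described above can be sketched as follows. The conditional probabilities used here are an assumption: they are the statistics of the Helstrom-optimal binary measurement, for which the posterior probability of the favored state equals $1 - P_e$ for either outcome, rather than a transcription of the paper's own closed forms. The photon-counting comparison models OOK as a Z-channel, and the grid search over $\xi$ is an illustrative choice.

```python
import math


def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def dolinar_information(xi, s):
    """Mutual information (bits) for prior xi (less likely state), overlap s < 1."""
    pe = 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * xi * (1.0 - xi) * s))  # Helstrom
    p_odd = (xi - pe) / (1.0 - 2.0 * pe)        # assumed: P(odd count)
    p_odd_plus = pe * p_odd / (1.0 - xi)        # assumed: P(odd | likelier state)
    p_even_minus = pe * (1.0 - p_odd) / xi      # assumed: P(even | rarer state)
    return h2(p_odd) - (xi * h2(p_even_minus) + (1.0 - xi) * h2(p_odd_plus))


def counting_information(xi, s):
    """Mutual information (bits) for OOK with ideal photon counting."""
    return h2(xi * (1.0 - s)) - xi * h2(s)


def ook_capacity(energy, information, grid=2000):
    """Maximize over xi in (0, 1/2], with ON-pulse overlap s = exp(-E/xi)."""
    best = 0.0
    for i in range(1, grid + 1):
        xi = 0.5 * i / grid
        best = max(best, information(xi, math.exp(-energy / xi)))
    return best


energy = 0.01
c_dol = ook_capacity(energy, dolinar_information)
c_cnt = ook_capacity(energy, counting_information)
print(c_dol / energy, c_cnt / energy)  # photon efficiencies in bits/photon
```

Under these assumptions the Dolinar curve lies above the photon-counting curve by well under five percent at this operating point.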
Fig. 5. The general structure of an optical feedback receiver, which modifies 
a local oscillator field $\alpha_{LO}(t)$ as a function of the photodetector output $N(t)$. 
The orange thicker lines indicate optical fields and the thinner black lines 
represent electrical signals. 



V. Sense of Optimality of the Dolinar Receiver 

The Dolinar receiver is known to be the optimal hard-decision 
measurement on an arbitrary binary coherent-state 
alphabet. In this section, we show that it is also an optimal soft- 
decision receiver, at least for BPSK, in the sense of maximizing 
the mutual information obtained from the measurement, within 
a class of causal optical feedback receivers. For BPSK, we 
also show that the Dolinar receiver achieves the capacity of 
an ultimate receiver for a certain reciprocal channel with a 
complementary signaling geometry. 

A. Optimality with respect to mutual information 

Consider the optical detection system in Fig. 5, in which the 
received optical field is first displaced by an ideal coherent-state 
local oscillator, and subsequently observed with an ideal 
(unity quantum efficiency, no dark current) photon-counting 
photodetector. We assume that the incoming signal is constant 
over $t \in (0, T]$, and its value is randomly selected from an 
alphabet of $\mathcal{K}$ coherent states, $\{|\alpha_k\rangle; k = 1, 2, \ldots, \mathcal{K}\}$, with 
a priori probabilities $\{p_k; k = 1, 2, \ldots, \mathcal{K}\}$. Therefore, the 
output of the photodetector is a conditionally Poisson counting 
process $N(t)$, with rate $\lambda(t) = |\alpha_k + \alpha_{LO}(t)|^2$ for $t \in (0, T]$. 
We ignore bandwidth and dynamic range limitations in the 
feedback path and assume that the local oscillator field 
(amplitude and phase) can be varied instantaneously, based on the 
counting process output from the photodetector. The objective 
function to be maximized by the local oscillator is the mutual 
information between the random variable $K$ and the counting 
process $N(t)$ over the time window $[0, T]$. In particular, the 
objective is to maximize 

$$I(K; \{N(t) : 0 < t \le T\}), \tag{27}$$

by choosing the local oscillator field $\alpha_{LO}(t)$ using the only 
information that the receiver has, i.e., the photon count times 
observed in $N(t)$.^2 We assume that $N(0) = 0$ with 
probability 1, with no loss of generality. 

We will first prove that maximizing the global mutual 
information given in (27) is equivalent to incrementally maximizing 
the mutual information in the time window $[t, t + \Delta t]$, as 
$\Delta t \to 0$, conditioned on the observations up to and including 
time $t$. Let us begin by expressing (27) as the following limit, 

$$I(K; \{N(t) : 0 < t \le T\}) = \lim_{\Delta T \to 0} I\left(K; \{N_m\}_0^{\lfloor T/\Delta T \rfloor}\right), \tag{28}$$

where $N_m = N((m + 1)\Delta T) - N(m \Delta T)$, and $\{N_i\}_j^m = 
\{N_j, N_{j+1}, \ldots, N_m\}$ for $j < m$. We can now use the chain 
rule for mutual information to expand the argument of the 
limit as 

$$I\left(K; \{N_m\}_0^{\lfloor T/\Delta T \rfloor}\right) = \sum_{m=0}^{\lfloor T/\Delta T \rfloor} \Delta T\, \frac{I\left(K; N_m \mid \{N_i\}_0^{m-1}\right)}{\Delta T}. \tag{29}$$

The statistical regularity of a Poisson process ensures that the 
second term inside the summation converges as $\Delta T$ converges 
to 0. Hence, substituting (29) into (28), and recognizing that 
the local oscillator field $\alpha_{LO}(t)$ is a (causal) function of the 
counting process realization, we obtain 

$$\max_{\alpha_{LO}(t)} I(K; N(t) : 0 < t \le T) = \int_0^T dt\, E_n\left[\max_{\alpha_{LO}(t)} \lim_{\Delta T \to 0} \frac{i\left(K; N(t + \Delta T) - N(t) \mid \{n(\tau) : \tau \in (0, t]\}\right)}{\Delta T}\right], \tag{30}$$

where $n(t)$ denotes a particular realization of the photodetector 
output $N(t)$, and $i(X; Y|z)$ is the mutual information between 
$X$ and $Y$, conditioned on a particular realization of the 
random variable $Z$.^3 Equation (30) shows that the optimal local 
oscillator that maximizes the mutual information between the 
input symbol $K$ and the photodetector output over $0 < t \le T$ 
can be chosen at each time instant $t$, such that it incrementally 
maximizes the conditional mutual information at the next time 
instant, given the observations up to and including $t$. 

Next, we turn our attention to the numerator of the limit in 
(30), given by 

$$i\left(K; N(t + \Delta T) - N(t) \mid \{n(\tau); \tau \in (0, t]\}\right) = H\left(N(t + \Delta T) - N(t) \mid \{n(\tau); \tau \in (0, t]\}\right) - H\left(N(t + \Delta T) - N(t) \mid K, \{n(\tau); \tau \in (0, t]\}\right), \tag{31}$$

where $H(\cdot)$ is the well-known (discrete) entropy function [2]. 
Because $N(t)$ is a Poisson process when conditioned on $K$ 
(and $\alpha_{LO}(t)$), $N(t + \Delta T) - N(t)$ is a conditionally-Poisson 
random variable with mean approximately $\Delta T |\alpha_{LO}(t) + \alpha_k|^2$, 
and a compound Poisson random variable when conditioned 
on $\alpha_{LO}(t)$ alone. Using tight bounds on the entropy of Poisson 
and compound Poisson random variables [17], it can be shown 
that 

$$\lim_{\Delta T \to 0} \frac{i\left(K; N(t + \Delta T) - N(t) \mid \{n(\tau); \tau \in (0, t]\}\right)}{\Delta T} = -\Lambda \log \Lambda + \sum_{k=1}^{\mathcal{K}} p_k \lambda_k \log \lambda_k, \tag{32}$$


^2 Note that the local oscillator field must depend on the counting process 
causally. 

^3 Note that the conditional mutual information is given by $I(X;Y|Z) = 
E_Z[i(X;Y|z)]$ [2]. 



where $\lambda_k = |\alpha_{LO}(t) + \alpha_k|^2$ for $k \in \{1, 2, \ldots, \mathcal{K}\}$, 
$\Lambda = \sum_k p_k \lambda_k$, and we have suppressed the time-dependence of 
the variables to avoid clutter. It is worthwhile to point out that 
the right-hand side of (32) is non-negative, as required: the 
function $x \log x$ is convex, so Jensen's inequality gives 
$\sum_k p_k \lambda_k \log \lambda_k \ge \Lambda \log \Lambda$. The right-hand side of 
Eq. (32) is in a form that can be maximized in terms of $\alpha_{LO}$. In 
general, $\alpha_{LO} = a \exp(i\phi)$ is complex, thus the maximization 
must be carried out over both the amplitude and the phase. 
The optimal solution must therefore satisfy 

$$\sum_k p_k \left[a + |\alpha_k| \cos(\phi - \theta_k)\right] \log \frac{\lambda_k}{\Lambda} = 0, \tag{33}$$

$$\sum_k p_k |\alpha_k| \sin(\phi - \theta_k) \log \frac{\lambda_k}{\Lambda} = 0, \tag{34}$$

where $\theta_k = \angle \alpha_k$. The solution to these equations for general 
$\{\alpha_k\}$ is nontrivial. It can be noted, however, that if all $\alpha_k$ have 
common phase (up to a $\pi$ phase shift), then the local oscillator 
is also either in phase or out-of-phase with the constellation. 

In particular, in a BPSK scenario with a constellation 
$\{|{-\alpha}\rangle, |\alpha\rangle\}$, the solution to these equations can be found 
analytically, and perhaps unsurprisingly, the optimal local 
oscillator turns out to be identical to the local oscillator that 
minimizes the probability of error in classifying the signal 
state, i.e., the Dolinar receiver [13]. This result shows that, for 
BPSK at least, the Dolinar receiver is the optimal soft-decision 
receiver within the class of causal optical feedback receivers. 
But this does not imply that higher mutual information cannot 
be achieved with a receiver outside of this class. 
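
The incremental objective is simple to evaluate for a given local oscillator setting. The following sketch computes the information rate $-\Lambda \log \Lambda + \sum_k p_k \lambda_k \log \lambda_k$, with $\lambda_k = |\alpha_{LO} + \alpha_k|^2$ and $\Lambda = \sum_k p_k \lambda_k$, in nats per unit time, and scans a grid of complex local oscillator values. It does not implement the Dolinar feedback law itself; the function names, the unit-amplitude BPSK constellation and the unequal priors are illustrative assumptions.

```python
import math


def xlogx(x):
    """x * ln(x), extended by continuity to xlogx(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def information_rate(alpha_lo, alphas, priors):
    """Incremental mutual information rate (nats per unit time).

    alpha_lo: complex local oscillator amplitude at the current instant.
    alphas:   complex signal amplitudes of the coherent-state alphabet.
    priors:   current probabilities of the alphabet entries.
    """
    rates = [abs(alpha_lo + a) ** 2 for a in alphas]
    mean_rate = sum(p * r for p, r in zip(priors, rates))
    return sum(p * xlogx(r) for p, r in zip(priors, rates)) - xlogx(mean_rate)


# Illustrative BPSK alphabet with unequal priors.
bpsk = [1.0 + 0.0j, -1.0 + 0.0j]
priors = [0.8, 0.2]

# Coarse grid scan over local oscillator amplitude and phase.
candidates = [
    amp * complex(math.cos(ph), math.sin(ph))
    for amp in [0.1 * i for i in range(0, 51)]
    for ph in [math.pi * j / 8 for j in range(16)]
]
best_lo = max(candidates, key=lambda lo: information_rate(lo, bpsk, priors))
print(best_lo, information_rate(best_lo, bpsk, priors))
```

For this real-valued constellation a quadrature local oscillator makes both count rates equal and therefore conveys no information, which matches the in-phase or out-of-phase observation in the text.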

B. Reciprocal channel optimality 

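For BPSK modulation, the capacity (bits) achieved by the 
Dolinar receiver is given in terms of the energy $E$ (photons) 
of each BPSK-modulated coherent state $|\psi_0\rangle$, $|\psi_1\rangle$: 

$$C_{\text{Dol+BPSK}} = 1 - H_2\left(\frac{1}{2}\left(1 - \sqrt{1 - e^{-4E}}\right)\right), \tag{35}$$

where $H_2(\cdot)$ is the binary entropy function. The Holevo 
capacity for BPSK-modulated coherent states is: 

$$C_{\text{Hol+BPSK}} = H_2\left(\frac{1}{2}\left(1 - e^{-2E}\right)\right). \tag{36}$$

Let $s$ denote the overlap between the two BPSK-modulated 
coherent states: 

$$s = |\langle\psi_0|\psi_1\rangle|^2 = e^{-4E} \quad \text{(for BPSK)}. \tag{37}$$

Writing the two capacity expressions in terms of $s$, we see 
that: 

$$C_{\text{Dol+BPSK}}(s) = 1 - H_2\left(\frac{1}{2}\left(1 - \sqrt{1 - s}\right)\right) \tag{38}$$

and 

$$C_{\text{Hol+BPSK}}(s) = H_2\left(\frac{1}{2}\left(1 - \sqrt{s}\right)\right). \tag{39}$$

Thus, we have the result that: 

$$C_{\text{Hol+BPSK}}(s) + C_{\text{Dol+BPSK}}(1 - s) = 1 \text{ bit}. \tag{40}$$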



In other words, the ultimate receiver and the Dolinar receiver
for BPSK are "complementary" or "reciprocal" in that the total
potential capacity of any binary modulation (1 bit) is obtainable
as the sum of the capacities of the two receivers given
complementary signaling geometries, one with $|\langle\psi_0|\psi_1\rangle|^2 = s$
and one with $|\langle\psi_0|\psi_1\rangle|^2 = 1 - s$. This intriguing property
is exact, not an approximation. It is similar in concept to a
reciprocal channel relation between variable and check node
operations in decoders for low-density parity-check (LDPC)
codes, which is used to accurately approximate density
evolution [18]. But its significance in the current context is unclear.
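
The reciprocal relation can be checked numerically. The following is a minimal sketch, assuming the two capacities take the forms $C_{\mathrm{Dol+BPSK}}(s) = 1 - H_2(\tfrac{1}{2}(1 - \sqrt{1-s}))$ and $C_{\mathrm{Hol+BPSK}}(s) = H_2(\tfrac{1}{2}(1 - \sqrt{s}))$ with $s = e^{-4E}$; the function names are illustrative and not taken from the paper.

```python
import math


def h2(p: float) -> float:
    """Binary entropy function H_2(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def c_dolinar_bpsk(s: float) -> float:
    """Assumed form of the Dolinar-receiver capacity vs. squared overlap s."""
    return 1.0 - h2(0.5 * (1.0 - math.sqrt(1.0 - s)))


def c_holevo_bpsk(s: float) -> float:
    """Assumed form of the Holevo capacity vs. squared overlap s."""
    return h2(0.5 * (1.0 - math.sqrt(s)))


def overlap_from_energy(energy: float) -> float:
    """Squared overlap s = exp(-4E) of the two BPSK coherent states."""
    return math.exp(-4.0 * energy)


if __name__ == "__main__":
    for energy in (0.05, 0.25, 1.0):
        s = overlap_from_energy(energy)
        total = c_holevo_bpsk(s) + c_dolinar_bpsk(1.0 - s)
        print(f"E={energy:5.2f}  s={s:.4f}  C_Hol(s)+C_Dol(1-s)={total:.12f}")
```

Note that the two capacities are evaluated at different overlaps, s and 1 - s, so the relation pairs two different signaling geometries rather than two receivers operating on the same signal.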

VI. Dolinar receiver with adaptive priors 

The previous two sections focused on computing the ca- 
pacity of the Dolinar receiver applied to a single symbol 
represented by one of two coherent states. In this section, we 
extend the Dolinar receiver to make adaptive measurements 
on a coded sequence of coherent state symbols. Information 
from previous measurements is used to adjust the a priori 
probabilities of the next symbols. 

A. Definition of the adaptive Dolinar receiver 

Consider a communication channel with binary modulation. 
The transmitter puts each quantum mode into either state $|\psi_0\rangle$
or state $|\psi_1\rangle$. The overlap of the two states is $|\langle\psi_0|\psi_1\rangle|^2 = s$.
When $s \neq 0$, the two states are not orthogonal, and therefore
are not perfectly distinguishable. We will consider a channel 
which has no effect on the transmitted states (the identity 
map channel), so that the receiver may perform measurements 
directly on the transmitted states. 

Consider a receiver that yields a random measurement result
$Y \in \{0, 1\}$. The receiver may be defined by two Hermitian
operators $\{M_0, M_1\}$, where the probability of measurement
result y given state $|\psi_x\rangle$ is given by

$$P(Y = y | X = x) = \langle\psi_x|M_y|\psi_x\rangle. \tag{41}$$



Since the two possible outcomes are collectively exhaustive,
$M_0 + M_1 = I$, where $I$ is the identity operator on the space
spanned by $\{|\psi_0\rangle, |\psi_1\rangle\}$.

The Dolinar receiver [13]-[15] implements a projective
measurement ($M_y M_{y'} = \delta_{y,y'} M_y$) on a single optical mode.
The measurement operators $M_y = M_y(\xi)$ are functions of
the a priori probability $\xi$ that state $|\psi_1\rangle$ is transmitted. For
the Dolinar receiver measurement, the probabilities of the two
possible outcomes y, given the two possible states $|\psi_x\rangle$, are the
four conditional probabilities evaluated in (20) and depicted
in Fig. 4:

$$\begin{aligned}
P(Y = 0 | X = 0; \xi) &= p^{+}_{\mathrm{even}} \\
P(Y = 1 | X = 0; \xi) &= p^{+}_{\mathrm{odd}} \\
P(Y = 0 | X = 1; \xi) &= p^{-}_{\mathrm{even}} \\
P(Y = 1 | X = 1; \xi) &= p^{-}_{\mathrm{odd}}
\end{aligned} \tag{42}$$
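
To make the role of the prior $\xi$ concrete, the sketch below builds a two-outcome projective measurement on the two-dimensional space spanned by the signal states and evaluates the four conditional probabilities $\langle\psi_x|M_y|\psi_x\rangle$. It assumes that the Dolinar operators $M_y(\xi)$ coincide with the minimum-error (Helstrom) projectors for prior $\xi$, and it uses a real 2-D representation of the states; both are assumptions made for illustration, and the paper's own closed-form expressions are those referenced in (20).

```python
import math


def helstrom_probs(xi: float, s: float):
    """Return p with p[x][y] = P(Y=y | X=x; xi) for two pure states with
    squared overlap s, measured with the minimum-error projectors.

    Assumption: the Dolinar operators M_y(xi) equal these projectors.
    """
    c, t = math.sqrt(s), math.sqrt(1.0 - s)
    # Real 2-D representation: |psi_0> = (1, 0), |psi_1> = (c, t).
    # Gamma = xi |psi_1><psi_1| - (1 - xi) |psi_0><psi_0| = [[a, b], [b, d]].
    a = xi * s - (1.0 - xi)
    b = xi * c * t
    d = xi * (1.0 - s)
    # Unit eigenvector v of the larger eigenvalue of Gamma.
    # M_1 = v v^T projects onto it, and M_0 = I - M_1.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    v0, v1 = math.cos(theta), math.sin(theta)
    p1_x0 = min(1.0, v0 ** 2)                   # <psi_0| M_1 |psi_0>
    p1_x1 = min(1.0, (c * v0 + t * v1) ** 2)    # <psi_1| M_1 |psi_1>
    return ((1.0 - p1_x0, p1_x0), (1.0 - p1_x1, p1_x1))


def error_probability(xi: float, s: float) -> float:
    """Average probability that the outcome Y differs from the sent X."""
    p = helstrom_probs(xi, s)
    return (1.0 - xi) * p[0][1] + xi * p[1][0]


def helstrom_bound(xi: float, s: float) -> float:
    """Minimum error probability for two pure states with prior xi."""
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * xi * (1.0 - xi) * s))


if __name__ == "__main__":
    s = math.exp(-4.0 * 0.25)  # BPSK with E = 0.25 photons
    for xi in (0.5, 0.2):
        print(xi, helstrom_probs(xi, s), error_probability(xi, s))
```

For $\xi = 1/2$ the two error probabilities are equal, so the measured channel is a binary symmetric channel; for unequal priors the measurement tilts toward the more likely state.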

The Dolinar receiver measures a single mode of the optical 
field. We now consider n optical modes, which may differ 
in their spatial, temporal, or polarization dimensions. The
transmitter prepares mode j (where $j \in \{1, 2, \ldots, n\}$) in state
$|\psi_{x_j}\rangle$, where $x_j \in \{0, 1\}$ for binary modulation. The quantum
state of the system may then be represented as the product
state

$$|\psi_{\mathbf{x}}\rangle = \bigotimes_{j=1}^{n} |\psi_{x_j}\rangle \tag{43}$$



We represent the n binary digits $\{x_j\}$ as the vector $\mathbf{x}$;
there are $2^n$ possible such vectors, which we will denote as
$\{\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(2^n)}\}$. Again we assume that the channel is
an identity map, so the receiver may perform measurements
directly on the transmitted states. 

We now describe a new receiver for the n-mode channel. 
This receiver is an extension of the Dolinar receiver which 
makes use of information received on some modes to adapt 
the measurement of other modes. The new receiver performs 
measurements of each mode sequentially and consecutively 
(the measurement of one mode is completed before the mea- 
surement of the next mode begins). This could be further 
extended to include measurements which occur in parallel on 
multiple modes, so that partial measurement results could be 
used in an adaptive algorithm; however, we do not explore 
this possibility here. The new receiver performs a Dolinar
measurement on each mode, but with a variable parameter $\xi_j$
that depends on the outcome of previous measurements.
The $2^n$ measurement operators are thus of the form

$$\mathcal{M}^{(l)} = \bigotimes_{j=1}^{n} M_{y_j^{(l)}}(\xi_j), \qquad l \in \{1, 2, \ldots, 2^n\} \tag{44}$$

where $\{\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots, \mathbf{y}^{(2^n)}\}$ is the set of possible measurement
outcome sequences. These outcomes are completely exhaustive,
so that

$$\sum_{l=1}^{2^n} \mathcal{M}^{(l)} = \mathcal{I} \tag{45}$$

where $\mathcal{I}$ is the identity operator on the space spanned by
the $2^n$ state vectors $\{|\psi_{\mathbf{x}^{(1)}}\rangle, \ldots, |\psi_{\mathbf{x}^{(2^n)}}\rangle\}$. We take $\xi_j$
to be a function of previous measurement results, so
$\xi_j = \xi_j(y_0, y_1, \ldots, y_{j-1})$.

The measurement result of mode j is given by $Y_j \in \{0, 1\}$,
and $\mathbf{Y}$ represents the n binary digits of $\{Y_j\}$. The conditional
probability of a sequence of measurement outcomes may be
written as

$$P(\mathbf{Y} = \mathbf{y} | \mathbf{X} = \mathbf{x}) = \prod_{j=1}^{n} P(Y_j = y_j | X_j = x_j; \xi_j) \tag{46}$$

where the conditional probability of each outcome is given by

$$P(Y_j = y_j | X_j = x_j; \xi_j) = \langle\psi_{x_j}|M_{y_j}(\xi_j)|\psi_{x_j}\rangle \tag{47}$$

and $M_y(\xi)$ is the single-mode Dolinar operator for outcome
y with parameter $\xi$.

The conditional probabilities of Y given X are completely
described by equations (42), (46), and (47), once the function
$\xi_j(y_0, y_1, \ldots, y_{j-1})$ is specified. We select the function

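$$\xi_j = \sum_{l=1}^{2^n} q_{j-1}^{(l)}\, x_j^{(l)} \tag{48}$$

where the values of $q_j^{(l)}$ obey $0 \le q_j^{(l)} \le 1$ and are defined by
the iterative relation

$$q_j^{(l)} = S_j\, q_{j-1}^{(l)}\, P(Y_j = y_j | X_j = x_j^{(l)}; \xi_j) \tag{49}$$

where $S_j$ is a normalization constant chosen such that

$$\sum_{l=1}^{2^n} q_j^{(l)} = 1. \tag{50}$$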

The initial values are given by the a priori probabilities of
sending each string,

$$q_0^{(l)} = P(\mathbf{X} = \mathbf{x}^{(l)}). \tag{51}$$

From the receiver's perspective, $q_j^{(l)}$ represents the probability
that string $\mathbf{x}^{(l)}$ is the transmitted string, conditioned on the
measurement results $\{y_0, y_1, \ldots, y_j\}$.

If all messages $\mathbf{x}^{(l)}$ have equal a priori probabilities, then
equations (48), (49), and (51) lead to the trivial feedback
rule $\xi_j = 1/2$, $\forall j$. In order to investigate the performance
of this receiver, we imposed a code constraint on the allowed
transmitted sequences. That is, we selected

$$P(\mathbf{X} = \mathbf{x}^{(l)}) = \begin{cases} 1/2^k, & \mathbf{x}^{(l)} \in S_{\mathrm{code}} \\ 0, & \mathbf{x}^{(l)} \notin S_{\mathrm{code}} \end{cases} \tag{52}$$

where $S_{\mathrm{code}}$ is a set of sequences which obey a code constraint,
and the cardinality of $S_{\mathrm{code}}$ is $2^k$. The code is described as an
(n, k) code, since k bits of information are represented in the
sequence of length n.
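
The feedback rule can be illustrated with a short sketch. It assumes that $\xi_j$ is the posterior probability, given the earlier outcomes, that mode j carries a 1, and that the string probabilities are updated by Bayes' rule after each outcome. The names `feedback_priors` and `toy_channel` are hypothetical, and `toy_channel` is only a stand-in for the Dolinar conditional probabilities (it ignores $\xi$ and uses a fixed symmetric error rate).

```python
import itertools


def feedback_priors(codewords, priors, y, cond_prob):
    """Return [xi_1, ..., xi_n] for a given outcome sequence y.

    cond_prob(xi) must return p with p[x][y] = P(Y=y | X=x; xi).
    priors[l] is the a priori probability of codewords[l].
    """
    q = list(priors)  # current probability of each string
    xis = []
    for j, yj in enumerate(y):
        # Probability that mode j carries a 1, given outcomes so far.
        xi = sum(ql for ql, x in zip(q, codewords) if x[j] == 1)
        xis.append(xi)
        p = cond_prob(xi)
        # Bayes update with the new outcome, then renormalize.
        q = [ql * p[x[j]][yj] for ql, x in zip(q, codewords)]
        norm = sum(q)
        q = [ql / norm for ql in q]
    return xis


def toy_channel(xi, eps=0.1):
    """Hypothetical placeholder: symmetric error rate eps, independent of xi."""
    return ((1.0 - eps, eps), (eps, 1.0 - eps))


def uniform(words):
    """Equal a priori probability for every listed string."""
    return [1.0 / len(words)] * len(words)


all_strings = list(itertools.product((0, 1), repeat=3))
# Example constraint: third bit is the parity of the first two.
parity_code = [x for x in all_strings if x[2] == x[0] ^ x[1]]

if __name__ == "__main__":
    print(feedback_priors(all_strings, uniform(all_strings), (0, 1, 1), toy_channel))
    print(feedback_priors(parity_code, uniform(parity_code), (0, 0, 0), toy_channel))
```

With all $2^n$ strings equally likely, every $\xi_j$ stays at 1/2 regardless of the outcomes, which is the trivial rule noted above. With the parity constraint, the first two priors remain 1/2 and only the prior of the parity bit moves.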

B. Adaptive Dolinar receiver applied to small codes 

The first code we consider is the (3,2) parity check code,
where the members of the set $S_{\mathrm{code}}$ satisfy $x_3 = x_1 \oplus x_2$. We
take the case of BPSK modulation, where the physical states
used are optical coherent states, $|\psi_0\rangle = |\alpha\rangle$ and $|\psi_1\rangle = |-\alpha\rangle$.
From (37), the overlap of the two BPSK-modulated coherent
states is $s = \exp(-4E)$, where E is the average number of
photons used in each mode.

In Figure 6, we plot the probability of error in the third bit,
$P(Y_3 \neq X_3)$ vs. E. For comparison, we also plot the same
quantity for a Dolinar receiver with fixed parameter $\xi = 1/2$.
It is apparent that the adaptive receiver has a lower probability
of error, particularly for higher photon numbers.

Probability of error, however, is not the only metric of inter- 
est in the communication system. The dimensional information 
efficiency $C_d$ and photon information efficiency $C_p$ are given
by

$$C_d = I(\mathbf{X};\mathbf{Y})/n, \qquad C_p = I(\mathbf{X};\mathbf{Y})/(nE) \tag{53}$$

where $I(\mathbf{X};\mathbf{Y})$ is the mutual information between the random
binary strings X and Y.

Fig. 6. Probability of measurement error in the third bit $P(Y_3 \neq X_3)$ vs.
the average number of photons per mode E assuming BPSK modulation. The
black dotted line indicates the performance of a Dolinar receiver with fixed
parameter $\xi = 1/2$, while the solid red line is for the new adaptive receiver.
The message constraint is the (3,2) parity check code.

Fig. 7. Dimensional information efficiency $C_d$ vs. photon information
efficiency $C_p$ for three BPSK receivers. The dashed blue line is an uncoded system
with a Dolinar receiver. The dotted black line has the (3, 2) parity check code
constraint and uses a Dolinar receiver with fixed parameter $\xi = 1/2$. The solid
red line uses the (3, 2) parity check and the adaptive receiver.

Figure 7 shows these quantities. For comparison, the same
quantities are plotted for a transmitter with the (3, 2) parity
check code constraint and a receiver without feedback ($\xi_j = 1/2$,
$\forall j$), and for a transmitter without a code constraint and
a receiver without feedback. The adaptive receiver performs
slightly better than the fixed receiver with the code constraint,
though the gap is quite small. The large improvements in error
rate evident in Figure 6 have only a modest effect on the $C_d$ vs. $C_p$
tradeoff, since they occur at comparatively large E. In this regime, the
mutual information for both the fixed and adaptive receiver is
approaching its maximum value of k = 2 bits, corresponding
to the asymptotic limit $\lim_{C_p \to 0} C_d = k/n = 2/3$. On the
other hand, the system without the code constraint approaches
$C_d = 1$ bit/mode due to the assumption of binary modulation.
So in the high E limit, where $C_p$ is small, the uncoded system
has a higher $C_d$.
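
A compact way to reproduce this kind of comparison is to enumerate all outcome sequences and compute the mutual information exactly. The sketch below is an illustration under stated assumptions, not the authors' code: it models each Dolinar measurement as the minimum-error projective measurement for prior $\xi$ on two states with squared overlap $s = e^{-4E}$, takes $\xi_j$ as the posterior probability that mode j carries a 1, and uses hypothetical function names throughout.

```python
import itertools
import math


def dolinar_probs(xi, s):
    """p[x][y] = P(Y=y | X=x; xi), assuming minimum-error projectors."""
    c, t = math.sqrt(s), math.sqrt(1.0 - s)
    a, b, d = xi * s - (1.0 - xi), xi * c * t, xi * (1.0 - s)
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # top eigenvector of Gamma
    v0, v1 = math.cos(theta), math.sin(theta)
    p1_x0 = min(1.0, v0 ** 2)
    p1_x1 = min(1.0, (c * v0 + t * v1) ** 2)
    return ((1.0 - p1_x0, p1_x0), (1.0 - p1_x1, p1_x1))


def mutual_information(codewords, s, adaptive=True):
    """Exact I(X;Y) in bits for equiprobable codewords."""
    n = len(codewords[0])
    prior = 1.0 / len(codewords)
    info = 0.0
    for y in itertools.product((0, 1), repeat=n):
        # w[l] = joint probability of codeword l and the outcomes so far.
        w = [prior] * len(codewords)
        for j in range(n):
            total = sum(w)
            if total <= 0.0:
                break
            if adaptive:
                xi = sum(wl for wl, x in zip(w, codewords) if x[j] == 1) / total
            else:
                xi = 0.5
            p = dolinar_probs(xi, s)
            w = [wl * p[x[j]][y[j]] for wl, x in zip(w, codewords)]
        p_y = sum(w)
        for wl in w:
            if wl > 0.0:
                info += wl * math.log2(wl / (prior * p_y))
    return info


def efficiencies(codewords, energy, adaptive=True):
    """Return (C_d, C_p) = (I/n, I/(n E)) for BPSK with E photons per mode."""
    n = len(codewords[0])
    info = mutual_information(codewords, math.exp(-4.0 * energy), adaptive)
    return info / n, info / (n * energy)


parity_code = [x for x in itertools.product((0, 1), repeat=3)
               if x[2] == x[0] ^ x[1]]

if __name__ == "__main__":
    for energy in (0.05, 0.5, 2.0):
        print(energy,
              efficiencies(parity_code, energy, adaptive=True),
              efficiencies(parity_code, energy, adaptive=False))
```

Passing the full set of $2^n$ strings as `codewords` gives the uncoded case, for which the adaptive and fixed receivers coincide because every prior remains 1/2.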
Fig. 8. Dimensional information efficiency $C_d$ vs. photon information
efficiency $C_p$ for three BPSK receivers. The dashed blue line is an uncoded
system with a Dolinar receiver. The dotted black line has the (7, 4) Hamming
code constraint and uses a Dolinar receiver with fixed parameter $\xi = 1/2$.
The solid red line uses the (7, 4) Hamming code and the adaptive receiver.

In the opposite limit, as $E \to 0$, the mutual information
approaches zero since the states become indistinguishable,
and hence $C_d \to 0$. In this regime the adaptive receiver
is approaching the same measurement strategy as the fixed
receiver, as seen by the converging error probabilities towards
the left of Figure 6. The $C_p$ of all three receivers shown in
Figure 7 appears to approach the same limit of 2.89 bits/photon.
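
For the uncoded system, the quoted limit is consistent with a short small-E expansion. The sketch below assumes the Dolinar-receiver capacity per mode is $1 - H_2(p)$ with $p = \tfrac{1}{2}(1 - \sqrt{1 - e^{-4E}})$, as in (35); it does not address the two coded receivers.

```latex
\begin{align*}
p &= \tfrac{1}{2} - \delta, \qquad
\delta = \tfrac{1}{2}\sqrt{1 - e^{-4E}} \approx \sqrt{E} \quad (E \to 0), \\
1 - H_2\!\left(\tfrac{1}{2} - \delta\right) &= \frac{2}{\ln 2}\,\delta^2 + O(\delta^4), \\
C_p = \frac{C_d}{E} &\approx \frac{2}{\ln 2}\cdot\frac{\delta^2}{E}
\;\longrightarrow\; \frac{2}{\ln 2} \approx 2.89 \text{ bits/photon}.
\end{align*}
```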

In order to see the dependence on the choice of code,
we also compute $C_d$ and $C_p$ for the same receivers with
the elements of $S_{\mathrm{code}}$ defined by the (7,4) Hamming code
constraints

$$\begin{aligned}
x_5 &= x_1 \oplus x_2 \oplus x_3, \\
x_6 &= x_2 \oplus x_3 \oplus x_4, \\
x_7 &= x_1 \oplus x_3 \oplus x_4.
\end{aligned} \tag{54}$$

The results, shown in Figure 8, are similar to those for the
(3, 2) parity check code. Again, the adaptive receiver slightly
outperforms the fixed receiver given the code constraint, but
does not exceed the efficiency of the unconstrained system.
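
As a quick consistency check on the code set, the sketch below enumerates the 16 codewords generated by three parity equations and computes their minimum Hamming distance. It assumes the parity equations $x_5 = x_1 \oplus x_2 \oplus x_3$, $x_6 = x_2 \oplus x_3 \oplus x_4$, and $x_7 = x_1 \oplus x_3 \oplus x_4$; the middle term of $x_6$ is a reconstruction from a damaged source, chosen because it is the choice that yields minimum distance 3.

```python
import itertools


def hamming74_codewords():
    """Enumerate the codewords for the assumed (7,4) parity equations."""
    words = []
    for x1, x2, x3, x4 in itertools.product((0, 1), repeat=4):
        x5 = x1 ^ x2 ^ x3
        x6 = x2 ^ x3 ^ x4  # assumed form of the second parity equation
        x7 = x1 ^ x3 ^ x4
        words.append((x1, x2, x3, x4, x5, x6, x7))
    return words


def min_distance(words):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for u, v in itertools.combinations(words, 2)
    )


if __name__ == "__main__":
    code = hamming74_codewords()
    print(len(code), min_distance(code))
```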

C. Non-improvement of capacity with adaptivity

Let us denote $C^{(1)}$ as the single-symbol measurement
capacity of a channel, i.e., the measurement is identical from
one symbol to the next, and it is optimized (over all possible
single-symbol measurements) to maximize the capacity. We
define the n-symbol measurement capacity by extension as
the capacity maximized over all possible measurements that
can be done jointly over n symbols. We denote an n-symbol-spanning
measurement as $M^{(n)}$, where the eigenspaces of
the operator define a probability operator-valued measurement
(POVM), and the eigenvectors are the corresponding measurement
outcomes. An n-symbol measurement is called adaptive
if it can be represented as

$$M^{(n)} = \bigotimes_{i=1}^{n} M_i^{(1)}(y_0, \ldots, y_{i-1}), \tag{55}$$

where $y_0, \ldots, y_{i-1}$ indicates the past outcomes from the
sequence of measurements. Suppose we denote the n-symbol
adaptive measurement capacity as $C_{\mathrm{adapt}}^{(n)}$. It has been
shown [19] that

$$C_{\mathrm{adapt}}^{(n)} = C^{(1)}, \tag{56}$$

i.e., the channel capacity with any single-symbol measurement
strategy that adapts to previous outcomes cannot
exceed that of the best single-symbol nonadaptive measurement.
It is also known [12] that a single-symbol measurement
is bounded away from the Holevo capacity whenever the
density matrices at the output of the channel do not commute
($s \neq 0$ in our case). However, the gap between $C^{(1)}$ and the
Holevo bound is not known. Furthermore, short of a brute-force
search, there is in general no method to find the optimal
$M^{(1)}$. Thus, there remains the possibility that an adaptive
receiver can improve the capacity beyond the suboptimal
measurement strategies that are currently known. Nonetheless,
in light of the negative results that have been obtained in
Section VI, we conjecture that adaptive measurements of the
type in (55) will not significantly close the gap from known
receivers to Holevo. Instead, we suspect that measurement
operators will be needed which are not factorable between
modes.

VII. Conclusion 

We developed a closed-form parametric formula for the DIE 
vs. PIE tradeoff for a system using PPM and photon counting, 
and we presented analytic asymptotic expressions for this case 
and for the ultimate quantum limit. We showed that any system 
using PPM and ideal photon counting must fall short of the 
maximum achievable DIE by a factor that increases linearly 
with PIE for high PIE. 

We worked out the mutual information resulting from a 
Dolinar receiver measurement for general binary coherent 
states with unequal a priori probabilities, and used this result 
to numerically compute the DIE vs. PIE tradeoff for this 
receiver used with on-off keying (OOK). But this provided 
only a minuscule improvement compared to OOK with photon 
counting. We derived an adaptive rule for an additive local 
oscillator that maximizes the mutual information between a 
receiver and a transmitter that selects from a set of coherent 
states. For binary phase-shift keying (BPSK) at least, this is 
equivalent to the operation of the Dolinar receiver. For BPSK, 
we also found that the Dolinar receiver achieves the capacity 
of an ultimate receiver for a certain reciprocal channel with a 
complementary signaling geometry. 

We attempted to beat the Dolinar receiver's DIE vs. PIE 
tradeoff for single symbols by applying an adaptive version of 
this receiver to multiple coded symbols, and then adjusting 
the a priori probabilities of the parity symbols based on 
measurements of the information symbols. This approach 
yields a tiny improvement over the capacity achieved by the 
Dolinar receiver subject to the code constraint, but the capacity 
worsens relative to that of the same modulation and receiver 
without the code constraint. 